Consistency of Surrogate Risk Minimization Methods for Binary Classification using Classification Calibrated Losses
Authors
Abstract
In the previous lecture, we saw that for a λ-strongly proper composite loss ψ, it is possible to bound the 0-1 regret of a classifier in terms of its ψ-regret. Hence, for a λ-strongly proper composite loss ψ, if we have a ψ-consistent algorithm, we can use it to obtain a 0-1 consistent algorithm. However, not all loss functions used as surrogates in binary classification are proper, the hinge loss being one such example.
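To make the hinge-loss example concrete, here is a small numerical sketch (our illustration, not part of the lecture notes). It minimizes the conditional hinge risk C_η(f) = η·max(0, 1−f) + (1−η)·max(0, 1+f) over a grid of scores: the minimizer's sign always agrees with the Bayes decision sign(2η−1), so the hinge loss is classification calibrated, yet the minimizer is ±1 regardless of the value of η, so η cannot be recovered through any link function and the loss is not proper composite.

```python
import numpy as np

def conditional_hinge_risk(f, eta):
    """C_eta(f) = eta * max(0, 1 - f) + (1 - eta) * max(0, 1 + f)."""
    return eta * np.maximum(0.0, 1.0 - f) + (1.0 - eta) * np.maximum(0.0, 1.0 + f)

scores = np.linspace(-3.0, 3.0, 1201)   # grid of candidate real-valued scores f
for eta in [0.1, 0.3, 0.7, 0.9]:        # class-probability values eta = P(Y=1 | X=x)
    f_star = scores[np.argmin(conditional_hinge_risk(scores, eta))]
    # The sign of the minimizer matches the Bayes decision: classification calibrated.
    assert np.sign(f_star) == np.sign(2 * eta - 1)
    # But f_star is +1 or -1 for every eta != 1/2, so eta is not recoverable
    # from the minimizer: the hinge loss is not a proper composite loss.
    print(f"eta = {eta:.1f} -> argmin f* = {f_star:+.1f}")
```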
Similar resources
Consistency of structured output learning with missing labels
In this paper we study statistical consistency of partial losses suitable for learning structured output predictors from examples containing missing labels. We provide sufficient conditions on the data-generating distribution under which the expected risk of the structured predictor learned by minimizing the partial loss converges to the optimal Bayes risk defined by an associated com...
Calibrated Surrogate Losses for Classification with Label-Dependent Costs
We present surrogate regret bounds for arbitrary surrogate losses in the context of binary classification with label-dependent costs. Such bounds relate a classifier’s risk, assessed with respect to a surrogate loss, to its cost-sensitive classification risk. Two approaches to surrogate regret bounds are developed. The first is a direct generalization of Bartlett et al. [2006], who focus on mar...
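For reference, in the standard formulation of this setting (our notation, not necessarily the paper's), with false-positive cost c0 and false-negative cost c1, the cost-sensitive risk and the Bayes-optimal classifier take the form:

```latex
% Cost-sensitive risk and Bayes rule, where \eta(x) = P(Y = +1 \mid X = x).
R_c(h) = c_0 \,\Pr\big(h(X) = +1,\, Y = -1\big) + c_1 \,\Pr\big(h(X) = -1,\, Y = +1\big),
\qquad
h^*(x) = \operatorname{sign}\!\Big(\eta(x) - \frac{c_0}{c_0 + c_1}\Big).
```

So label-dependent costs simply shift the plug-in threshold from 1/2 to c0/(c0 + c1).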
Classification Methods with Reject Option Based on Convex Risk Minimization
In this paper, we investigate the problem of binary classification with a reject option, in which one can withhold the decision of classifying an observation at a cost lower than that of misclassification. Since the natural loss function is non-convex, which makes empirical risk minimization computationally infeasible, the paper proposes minimizing convex risks based on surrogate convex loss functions...
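The classical plug-in rule for this setting is Chow's rule; a minimal sketch, assuming labels in {−1, +1}, misclassification cost 1, and an illustrative rejection cost d = 0.2 (not taken from the paper):

```python
import numpy as np

def chow_rule(eta, d=0.2):
    """Plug-in classifier with reject option (Chow's rule): with rejection cost
    d < 1/2, predict +1 when eta >= 1 - d, predict -1 when eta <= d, and
    reject (encoded as 0) otherwise. The value d = 0.2 is illustrative."""
    eta = np.asarray(eta)
    return np.where(eta >= 1 - d, 1, np.where(eta <= d, -1, 0))

print(chow_rule([0.05, 0.3, 0.5, 0.85]))  # -> [-1  0  0  1]
```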
Chapter 11 Surrogate Risk Consistency: the Classification Case
I. The setting: supervised prediction problem
(a) Have data coming in pairs (X, Y) and a loss L : R × Y → R (can have more general losses)
(b) Often, it is hard to minimize L (for example, if L is non-convex), so we use a surrogate φ
(c) We would like to compare the risks of functions f : X → R: R_φ(f) := E[φ(f(X), Y)] and R(f) := E[L(f(X), Y)]. In particular, when does minimizing the surrogate g...
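To illustrate the two risks being compared, here is a small Monte Carlo sketch (our example, not from the chapter) estimating R_φ with the logistic surrogate φ(s, y) = log(1 + e^(−ys)) and R with the 0-1 loss, for linear scores f(x) = w·x on synthetic 1-D data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: labels y in {-1, +1}, features x ~ N(y, 1).
n = 10_000
y = rng.choice([-1, 1], size=n)
x = rng.normal(loc=y.astype(float), scale=1.0)

def surrogate_risk(w):
    """Empirical R_phi(f) with the logistic surrogate phi(s, y) = log(1 + exp(-y*s))."""
    return np.mean(np.log1p(np.exp(-y * (w * x))))

def zero_one_risk(w):
    """Empirical R(f) with the 0-1 loss, classifying by sign(w * x)."""
    return np.mean(np.sign(w * x) != y)

for w in [0.1, 1.0, 5.0]:
    print(f"w = {w:>4}: R_phi = {surrogate_risk(w):.3f}, R_0-1 = {zero_one_risk(w):.3f}")
```

Note that every w > 0 yields the same 0-1 risk (only the sign of w·x matters) while the surrogate risk varies with w; the consistency question is when driving R_φ to its minimum also drives R to its minimum.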
Consistency of Surrogate Risk Minimization Methods for Binary Classification using Strongly Proper Losses
We learnt that under certain conditions on the weights, a weighted-average plug-in classifier (or any learning algorithm that outputs such a classifier for the same training sample) is universally Bayes consistent w.r.t. the 0-1 loss. One might wonder for what other learning algorithms similar statements can be made. Can some of the other commonly studied/used learning algorithms be shown to be Bayes co...
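A minimal sketch of such a weighted-average plug-in classifier, assuming labels in {0, 1} and an illustrative Gaussian weighting (the kernel and bandwidth are our choices, not from the text):

```python
import numpy as np

def plug_in_predict(x_train, y_train, x_query, bandwidth=0.5):
    """Weighted-average plug-in classifier: estimate eta(x) = P(Y=1 | X=x) as a
    kernel-weighted average of the training labels, then threshold at 1/2."""
    d2 = (x_query[:, None] - x_train[None, :]) ** 2        # squared distances
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))               # Gaussian weights
    eta_hat = (w @ y_train) / np.clip(w.sum(axis=1), 1e-12, None)
    return (eta_hat >= 0.5).astype(int)

rng = np.random.default_rng(0)
x_tr = rng.normal(size=200)
y_tr = (rng.random(200) < 1 / (1 + np.exp(-3 * x_tr))).astype(int)  # eta increasing in x
print(plug_in_predict(x_tr, y_tr, np.array([-2.0, 0.0, 2.0])))
```

Consistency results for such classifiers typically require the weights to localize as the sample grows (e.g., bandwidth → 0 with n·bandwidth → ∞), in the spirit of Stone's conditions.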